Dynamic Discretization of Continuous Attributes
Abstract
Discretization of continuous attributes is an important task for certain types of machine learning algorithms. Bayesian approaches, for instance, require assumptions about data distributions. Decision trees, on the other hand, require sorting operations to deal with continuous attributes, which substantially increase learning times. This paper presents a new method of discretization whose main characteristic is that it takes into account interdependencies between attributes. Detecting interdependencies can be seen as discovering redundant attributes. This means that our method performs attribute selection as a side effect of the discretization. Empirical evaluation on five benchmark datasets from the UCI repository, using C4.5 and a naive Bayes classifier, shows a consistent reduction of the features without loss of generalization accuracy.
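For readers unfamiliar with the basic operation, the following is a minimal sketch of what discretizing a single continuous attribute looks like. It uses plain equal-width binning, which is a standard baseline and not the interdependency-aware method the abstract describes. The function name `equal_width_bins` and the sample values are hypothetical and chosen only for illustration.

```python
from typing import List


def equal_width_bins(values: List[float], n_bins: int) -> List[int]:
    """Map each continuous value to a bin index in 0..n_bins-1.

    Baseline equal-width binning; NOT the method proposed in the paper.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical sample of one continuous attribute.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
print(equal_width_bins(sample, 3))  # [0, 0, 0, 1, 1, 2]
```

A learner that expects discrete inputs can then treat the bin indices as attribute values, which avoids repeated sorting of the raw continuous values during training.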
Related resources
Hierarchical Discretization of Continuous Attributes Using Dynamic Programming
The area of knowledge discovery and data mining is growing rapidly. A large number of methods are employed to mine knowledge. Several of the methods rely on discrete data. However, most datasets used in real applications have attributes with continuous values. To make the data mining techniques useful for such datasets, discretization is performed as a pre-processing step. Discretization is a pr...
A dynamic-programming algorithm for hierarchical discretization of continuous attributes
Discretization techniques can be used to reduce the number of values for a given continuous attribute, and a concept hierarchy can be used to define a discretization of a given continuous attribute. Traditional methods of building a concept hierarchy from a continuous attribute are usually based on the level-wise approach. Unfortunately, this approach suffers from three weaknesses: (1) it only ...
Global discretization of continuous attributes as preprocessing for machine learning
Real-life data are usually represented in databases by real numbers. On the other hand, most inductive learning methods require a small number of attribute values. Thus it is necessary to convert input data sets with continuous attributes into input data sets with discrete attributes. Methods of discretization restricted to single continuous attributes will be called local, while methods that sim...
Discretization of Continuous-valued Attributes and Instance-based Learning
Recent work on discretization of continuous-valued attributes in learning decision trees has produced some positive results. This paper adopts the idea of discretization of continuous-valued attributes and applies it to instance-based learning (Aha, 1990; Aha, Kibler & Albert, 1991). Our experiments have shown that instance-based learning (IBL) usually performs well in continuous-valued attribu...
Compression-Based Discretization of Continuous Attributes
Discretization of continuous attributes into ordered discrete attributes can be beneficial even for propositional induction algorithms that are capable of handling continuous attributes directly. Benefits include possibly large improvements in induction time, smaller sizes of induced trees or rule sets, and even improved predictive accuracy. We define a global evaluation measure for discretization...
Publication date: 1998